Automatically Estimating Emotion in Music with Deep Long-Short Term Memory Recurrent Neural Networks

Authors

  • Eduardo Coutinho
  • George Trigeorgis
  • Stefanos Zafeiriou
  • Björn W. Schuller

Abstract

In this paper we describe our approach for the MediaEval “Emotion in Music” task. Our method consists of deep Long-Short Term Memory Recurrent Neural Networks (LSTM-RNN) for dynamic Arousal and Valence regression, using acoustic and psychoacoustic features extracted from the songs that have previously been proven effective for emotion prediction in music. Results on the challenge test set demonstrate excellent performance for Arousal estimation (r = 0.613 ± 0.278), but not for Valence (r = 0.026 ± 0.500). Issues with the reliability and distribution of the test set annotations are indicated as plausible explanations for these results. By using a subset of the development set that was left out for performance estimation, we could determine that the performance of our approach may be underestimated for Valence (Arousal: r = 0.596 ± 0.386; Valence: r = 0.458 ± 0.551).
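The scores above are reported as r = mean ± spread. A minimal sketch of how such a figure could be computed follows, assuming that Pearson's r is calculated separately for each song's time-continuous annotation and then summarized across songs as mean and standard deviation. The abstract does not state the aggregation, so this is an assumption, and the function names `pearson_r` and `summarize` are hypothetical.

```python
import numpy as np

def pearson_r(y_true, y_pred):
    """Pearson correlation between two equal-length 1-D sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    yt = y_true - y_true.mean()
    yp = y_pred - y_pred.mean()
    denom = np.sqrt((yt ** 2).sum() * (yp ** 2).sum())
    # Assumption: a constant sequence (zero variance) is scored as 0.0
    return float((yt * yp).sum() / denom) if denom > 0 else 0.0

def summarize(per_song_pairs):
    """Mean and standard deviation of per-song r values (assumed aggregation)."""
    rs = np.array([pearson_r(t, p) for t, p in per_song_pairs])
    return float(rs.mean()), float(rs.std())

# Illustrative toy data only, not taken from the paper
songs = [
    ([0.1, 0.2, 0.3, 0.4], [0.2, 0.4, 0.6, 0.8]),  # perfectly correlated
    ([0.1, 0.2, 0.3, 0.4], [0.4, 0.3, 0.2, 0.1]),  # perfectly anti-correlated
]
mean_r, std_r = summarize(songs)
print(f"r = {mean_r:.3f} ± {std_r:.3f}")
```

Under this reading, a large spread such as the ± 0.500 reported for Valence would mean that per-song correlations vary widely, with some songs tracked well and others negatively correlated.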

Related resources

Speech Emotion Recognition Using Scalogram Based Deep Structure

Speech Emotion Recognition (SER) is an important part of speech-based Human-Computer Interface (HCI) applications. Previous SER methods rely on the extraction of features and training an appropriate classifier. However, most of those features can be affected by emotionally irrelevant factors such as gender, speaking styles and environment. Here, an SER method has been proposed based on a concat...

Multi-Scale Approaches to the MediaEval 2015 "Emotion in Music" Task

The goal of the “Emotion in Music” task in MediaEval 2015 is to automatically estimate the emotions expressed by music (in terms of Arousal and Valence) in a time-continuous fashion. In this paper, considering the high context correlation among the music feature sequence, we study several multi-scale approaches at different levels, including acoustic feature learning with Deep Belief Networks (DB...

Integration of remote sensing and meteorological data to predict flooding time using deep learning algorithm

Accurate flood forecasting is vital for reducing flood risk. Because of the complicated structure of floods and river flow, this problem is somewhat difficult to solve. Artificial neural networks, such as recurrent neural networks, offer good performance on time series data. In recent years, the use of Long Short Term Memory networks has attracted much attention due to the faults of recurrent ne...

The Munich LSTM-RNN Approach to the MediaEval 2014 "Emotion in Music" Task

In this paper we describe TUM’s approach for the MediaEval’s “Emotion in Music” task. The goal of this task is to automatically estimate the emotions expressed by music (in terms of Arousal and Valence) in a time-continuous fashion. Our system consists of Long-Short Term Memory Recurrent Neural Networks (LSTM-RNN) for dynamic Arousal and Valence regression. We used two different sets of acousti...

Prediction of Covid-19 Prevalence and Fatality Rates in Iran Using Long Short-Term Memory Neural Network

Introduction: The rapid spread of COVID-19 has become a critical threat to the world. So far, millions of people worldwide have been infected with the disease. The Covid-19 pandemic has had significant effects on various aspects of human life. Currently, prediction of the virus's spread is essential in order to be safe and make necessary arrangements. It can help control the rate of its outbrea...

Publication date: 2015